Fast Exact Max-Kernel Search
Abstract
The wide applicability of kernels makes the problem of max-kernel search ubiquitous and more general than the usual similarity search in metric spaces. We focus on solving this problem efficiently. We begin by characterizing the inherent hardness of the max-kernel search problem with a novel notion of directional concentration. Following that, we present a method to use an O(n log n) algorithm to index any set of objects (points in R^D or abstract objects) directly in the Hilbert space without any explicit feature representations of the objects in this space. We present the first provably O(log n) algorithm for exact max-kernel search using this index. Empirical results for a variety of data sets as well as abstract objects demonstrate up to 4 orders of magnitude speedup in some cases. Extensions for approximate max-kernel search are also presented.
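For orientation, the problem stated in the abstract can be sketched as a linear scan: evaluate the kernel between the query and every reference object and keep the largest value. This is only the O(n) baseline that an index-based method aims to beat, not the paper's algorithm. The function names (`max_kernel_search`, `kernel_distance`, `linear_kernel`, `gaussian_kernel`) are hypothetical and chosen for illustration. The second helper uses the standard identity d(x, y)^2 = K(x, x) + K(y, y) - 2K(x, y), which is one way to measure distances in the Hilbert space induced by a positive semidefinite kernel using kernel evaluations alone, with no explicit feature representation.

```python
import math
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")
Kernel = Callable[[T, T], float]


def linear_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x: Sequence[float], y: Sequence[float],
                    bandwidth: float = 1.0) -> float:
    """Gaussian (RBF) kernel; the bandwidth value is an illustrative default."""
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared / (2.0 * bandwidth ** 2))


def max_kernel_search(query: T, references: Sequence[T],
                      kernel: Kernel) -> Tuple[int, float]:
    """Brute-force baseline: return (index, value) of the reference object
    that maximizes kernel(query, reference). Costs one kernel evaluation
    per reference object."""
    if not references:
        raise ValueError("references must be non-empty")
    best_index, best_value = -1, -math.inf
    for i, ref in enumerate(references):
        value = kernel(query, ref)
        if value > best_value:
            best_index, best_value = i, value
    return best_index, best_value


def kernel_distance(x: T, y: T, kernel: Kernel) -> float:
    """Distance between x and y in the kernel-induced Hilbert space,
    computed from kernel evaluations only (assumes a PSD kernel)."""
    squared = kernel(x, x) + kernel(y, y) - 2.0 * kernel(x, y)
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


if __name__ == "__main__":
    refs = [(1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
    print(max_kernel_search((1.0, 1.0), refs, linear_kernel))
    print(kernel_distance((0.0, 0.0), (3.0, 4.0), linear_kernel))
```

Because both helpers take the kernel as a plain callable, the same code works for abstract objects (strings, graphs) as long as a kernel function is supplied for them.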
Similar resources
Dual-tree fast exact max-kernel search
The problem of max-kernel search arises everywhere: given a query point $p_q$, a set of reference objects $S_r$, and some kernel $K$, find $\arg\max_{p_r \in S_r} K(p_q, p_r)$. Max-kernel search is ubiquitous and appears in countless domains of science, thanks to the wide applicability of kernels. A few domains include image matching, information retrieval, bio-informatics, similarity search, and collaborative fil...
Faster Dual-Tree Traversal for Nearest Neighbor Search
Nearest neighbor search is a nearly ubiquitous problem in computer science. When nearest neighbors are desired for a query set instead of a single query point, dual-tree algorithms often provide the fastest solution, especially in low-to-medium dimensions (i.e. up to a hundred or so), and can give exact results or absolute approximation guarantees, unlike hashing techniques. Using a recent deco...
Fast Inference and Learning with Sparse Belief Propagation
Even in trees, exact probabilistic inference can be expensive when the cardinality of the variables is large. This is especially troublesome for learning, because many standard estimation techniques, such as EM and conditional maximum likelihood, require calling an inference algorithm many times. In max-product inference, a standard heuristic for controlling this complexity in linear chains is ...
Learning to speed up MAP decoding with column generation
In this paper, we show how the connections between max-product message passing and linear programming relaxations allow for a more efficient exact algorithm for the MAP problem. Our proposed algorithm uses column generation to pass messages only on a small subset of the possible assignments to each variable, while guaranteeing to find the exact solution. This algorithm is three ...
Tunable GMM Kernels
The recently proposed “generalized min-max” (GMM) kernel [9] can be efficiently linearized, with direct applications in large-scale statistical learning and fast near neighbor search. The linearized GMM kernel was extensively compared in [9] with linearized radial basis function (RBF) kernel. On a large number of classification tasks, the tuning-free GMM kernel performs (surprisingly) well comp...